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ABSTRACT 

We present a Bayesian angular power spectrum and signal map inference engine which can be 
adapted to interferometric observations of anisotropies in the cosmic microwave background, 21 cm 
emission line mapping of galactic brightness fluctuations, or 21 cm absorption line mapping of neutral 
hydrogen in the dark ages. The method uses Gibbs sampling to generate a sampled representation of 
the angular power spectrum posterior and the posterior of signal maps given a set of measured visi- 
bilities in the uw-plane. We use a mock interferometric CMB observation to demonstrate the validity 
of this method in the flat-sky approximation when adapted to take into account arbitrary coverage 
of the uu-plane, mode-mode correlations due to observations on a finite patch, and heteroschedastic 
visibility errors. The computational requirements scale as O(n p logn p ) where n p measures the ra- 
tio of the size of the detector array to the inter-detector spacing, meaning that Gibbs sampling is a 
promising technique for meeting the data analysis requirements of future cosmology missions. 
Subject headings: cosmology:observations, cosmic microwave background, instrumenta- 
tiominterferometers, methods: data analysis, methods: statistical 



1. INTRODUCTION 

Interferometric techniques have several advantages 
compared to single-dish observations: for high resolu- 
tion experiments they perform the same work of an op- 
tical system but with large potential weight and size re- 
ductions, they are naturally differential and hence well- 
adapted to measurements of small anisotropies superim- 
posed on a large isotropic background, their noise char- 
acteristics are highly uncorrelated to a good approxima- 
tion, and they are well-suited for observations of statis- 
tically isotropic anisotropies, in which the si gnal corre- 
lations natu rally decouple in Fourier space (Thompson 
et al. 2001). Additionally, a revolution in the devel- 
opment ot efficient digital correlator technology, lead- 



ing to massive reductions in power requirements (Par- 
sons et al.|2008 ), has brought interferometric approaches 
back to the list of contenders for strategies for a future 
spac e-based cosmic microwav e background (CMB) mis- 
sion ( |Timbie fc T^icker1[2009| . 
Already, interferometers have made great strides 



Sievers fc CBI Collaboration 2005| |Sievers et al.||2009 l 
and observed the Sunyeav-Zel do vich excess at small an- 
gular scales (Mason et al. 2003 ). This excess ha s also 
been observed by the SZ Array ( Sharp et al.||2010 ). 

Instruments of the next generation ot interferometers 
are already being deployed or under active development. 
For example, th e Precision Array for Probing the Epoch 
of Reionization (Parsons et al. 2010) and the Murchison 
Widefield Array (Lonsdale et al. 2009]) will begin prob- 
ing th e 21 cm regime (Furlanetto et al. 2006 Lidz et al. 
20081), whi le the Australian SKA Pathfinde r (fjohnston 
et al. 2008) and the Karoo Array Telescope ( |Booth et a 
2009| ) wiTTstudy a variety of radio sources. The planned 
Square Kilometer Array (SKA; |Jarvis||2007[) will also 
be sensitive to high-£ CMB anisotropies (jSubrahmanyan 



m measuring CM B anisotropies (White et al. 1999[ 
Halverson et al.||2002[ ) and polarization (|Myers et al. 
e Anj 



2006p . The Degree Angular Scale Interferometer ob- 
served anisotropies in the ran ge I — 100 — 900 (Halverson 



et al. 



2002 Leitch et al. 2002 ) an d was the first to di scover 



CIVlB polarization anisotropies ( Kovac et al.|2 002). The 
Very Small Array has obse rved the CMB a t 33 GH z to an 
angula r scale of I = 1400 ( |Grainge et al.|20"03 Dickinson 



et al. 



2004b . Sensitive to the extremely small scales of 
t ~ 4()00 with a resolution of 3 — 10 arcminutes, the Cos- 
mic Background Imager (CBI; Pea rson et al. 2000) has 



produced a wealth of information"! Readhcad et al.|200"4 
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& Ekers 2002 1. For any science case, the large fields ot 
view, high resolution, and the large number of antennas 
pose significant computational challenges to current data 
analysis methods. 

Current methods for extracting angular power spec- 
trum estimates from obser vations, such as maximum like- 
lihood and pseudo-C| (see Tristram & Ganga 2007 for a 
review) , ha ve already been applied to inter feromet ric ob- 
servations ( Hobson fc Maisinger|2002| Myers et al.|2003 l 
and to more complicated anisotropy and polarizatio n ob- 
serving stra tegies such as drifting and mosaicking (Park 
et al. 2003). However, these strategies are computation- 
ally expensive and scale poorly — often 0(n p ) — with 
the data size n p . Recognizing this problem in observa- 
tions from bolometer-based imaging instruments, there 
have been many efforts to improve the scaling and ef- 
ficiency of data anal ysis algorithms, such a s applying 
massive parallelis m (Cantalupo et al. |2010 ), ada ptive 
sampling at \ow-l (Bcnabcd et al. 2009), wavelets (Fay 



2 



ct al. 2008), and Gibbs sampling techniques (Wandelt 
et al.||2004[ ). However, these approaches have not been 
extended to interferometers. 

Gibbs sampling is especially promising since it allows 
a full exploration of the joint posterior density of the 
angular power spectrum and signal reconstruction in 
0(n p log n p ) operations. This contrasts with the 0(n p ) 
scaling for any method that requires evaluations of the 
likelihood. Only Hamiltonian sampling has been demon- 



strated to achieve similar scaling behavior ( Taylor et al 



2008a) 

In the frequentist context, maximum likelihood esti- 
mation also scales as 0(11^) and is therefore not fea- 
sible for the size of current data sets. Faster estima- 
tors ha ve been develop ed: the optimal quadratic esti- 



mator (Tegmark 



19971, which requires only Q( n p ) op- 



erations, or pseudo-power spectrum estimators dH auser 
fc Peebles||l973l IWandelt et al.|[200T| |Hivon et aT 



Szapudi et al. 112001 ) , which are designed to approximate 



this quadratic estimator while costing only 0(n p - 5 ) op- 
erations. These tremendous cost savings have rendered 
the pseudo-power spectrum estimators popular choices, 
although it is unclear how to construct controlled approx- 
imations of the likelihood starting from such an estima- 
tor, especially in the regime of a small number of modes 



where a Gauss ian approximation may not hold ( Eisner & 
Wandelt|2012 ). Such a likelihood is necessary to perform 



the Markov Chain Monte Carlos with the Bayesian anal- 
ysis toolkits that have become standard in cosmological 
analysis. In our approach the parameter analysis could 
be performed directly in the context of the power spec- 
tr um analysis (see the discussion around Equation (21) 
Wandelt et al.|2004 ) or a controlled, convergent likeli- 



hood approxima tion could be bu ilt from the Gibbs sam- 
ples themselves ( Chu et aL||2005[) . 
Pioneered in the c ontext of CMB observa tions by | Jew- 



ell et al.| ( |2004[ ) and |Wandelt et aL] ( |2004| ), Gibbs sam- 
pling has found many applications, such as Wilkinson 
Microwave A nisotropy Probe (WMAP) data analysis in 
temperature J O 'D wyer et al.|2004"l|Dickinson et al.l2009l 



Larson et al. 2011 )' and polarization ([Larson et al.||20U7 - 
Eriksen et al ||20fJ7 Komatsu et al.pOlip , and has been 



extended to se arches in low signai-to-noise regimes ( Jew- 
ell et al. 2009). Recently Gibbs sampling has also been 
used successfully in combination with other sampling 
techniques to reconstruct large-scale structure density 
fields and power spectra from galaxy catalogs, modeled 



log-normal signal ( 


Kitaura & Enfilin 2008! IJasche et al. 


2010 


Kitaura et a 


.||2010| Jasche & Wandelt||2012). We 



servations beyond the CMB in Section [5j 

The technique of Gibbs sampling is applicable even 
withou t assuming that the data are Gaussian. For ex- 



ample, Sutton & Wandelt ( |2006[ ) discuss a general ap- 
proach to Bayesian image reconstruction from interfero- 
metric observations using a fluxon model. Conceptually, 
the Gibbs sampling approach applied to Gaussian ran- 
dom fields greatly clarifies the inter-relationship between 
optimal, minimum-variance filtering and angular power 
spectrum estimation. Essentially, Gibbs sampling can 
be understood as a non-l inear Wiener filter (as discussed 



serving as a minimum variance summary of what can be 
inferred about the signal without assuming anything a 
priori. We will return to this subject in Section [5] 

As such, Gibbs sampling neatly resolves an ambiguity 
in the astronomical data analysis literature: there are 
many recipes, but no clear guidance, for choosing the 
signa l covariance for the Wiener filter (Rybicki & Press 
1992|. The Bayesian approach implemented through 
Gibbs sampling gives a principled and self-consistent way 
to do angular power spectrum inference and signal recon- 
struction at the same time, including full propagation of 
the uncertainties. 

In this work we develop such a fully Bayesian analysis 
as applied to an observation of the CMB with a simple 
interferometer, including effects such as incomplete uv- 
plane coverage, a realistic noise model, and a finite beam 
size. Likelihood analysis for realis tic interferometer data 
has been discussed elsewhere (e.g., Myers et al. 2003 ), but 
our analysis is novel in the following ways: (1) we explore 
the joint posterior of signal and power spectra; (2) our 
analysis explores the full posterior shape; (3) it automat- 
ically and fully takes into account correlations between 
different points in the itu-plane due to the presence of 
the primary beam and noise; (4) it does so by generating 
a sampled representation which makes marginalization 
trivial (as opposed to calculating posterior or likelihood 
slices); (5) we propose a summary of the signal posterior 
in terms of the posterior mean, which results in an op- 
timal signal reconstruction (Wiener filtering) without a 
priori specification of the signal covariance; and (6) we 
show how one can build up detailed error estimates for 
the reconstruction from the posterior samples. 

The analysis reduces to a likelihood analysis for the 
choice of flat power spectrum priors, in which case 
the word posterior should be replaced by likelihood 
throughout the paper. While our example application 
of the technique is simplified — but nonetheless nearly 
realistic — it demonstrates the validity of the technique 
by providing a concrete example and opens the way for 
future improvements and applications. Section[2]outlines 
the method of Gibbs sampling as applied to interferomet- 
ric observations. Next we discuss our simulated interfer- 
ometer observation setup in Section [3] and present angu- 
lar power spectrum estimates and signal reconstructions 
in Section H] Finally, Section [5] summarizes our main 
findings and discusses the applicability of this method 
beyond the CMB. 

2. METHOD OF GIBBS SAMPLING 

We model the visibility data d obtained from an inter- 
ferometric observation as 



d = IF A 



I n, 



(1) 



in Wandelt et al. 20041 with the posterior mean map 



where s is a vector containing a discretization of the true 
sky, A is a primary beam pattern, F is a Fourier trans- 
form operator which converts from pixel space to the 
uw-plane, / is an interferometer pattern in the itv-planc, 
and n is a Gaussian realization of the noise. We discuss 
in more detail the construction of these elements below 
in Section[3] We discretize the signal s, data d, an d noise 
n with n p el ements, using a technique similar to Myers 
et al. ( 2003[ ), and we assume a flat-sky approximation 
throughout. 
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Gibbs sa mpling has been ext ensively discussed else- 
where (e.g., Wandelt et al.| [2004), so we only briefly dis- 
cuss the relevant equations as applied to interferometric 
observations here. We begin with some initial guess of 
the angular power spectrum C° and progressively iterate 
samples from the conditional distributions 



s i+1 <- P(s\C},m) 
&+ l <- P(C e \s* +1 ), 



(2) 
(3) 



where m is the least squares estimate of the signal s 
given the data d (i.e., A T N~ 1 Am = A T N^ 1 d). The 
samples (C\ : s l ) converge to samples from the joint dis- 
tribution P(C e , s, m) = P{m\s)P(s\Ct)P(Ci) after a suf- 
ficient number of iterations. 

Given a angular power spectrum sample C\, we gen- 
erate a new signal sample by drawing from a multivari- 
ate Gaussian with mean S l (S l + N)~ lr m and variance 
((S') -1 + N- 1 )- 1 . Here S and N are the signal and 
noise covariance, respectively. We do this by solving the 
set of equations 

M s l+1 = A T F- 1 I{INI)- 1 d 

+ F -i 5 -i/2 F6 (4) 

+ A T F- 1 I{INI)- 1/2 F £>, 



where wc define the matrix operator M as 



M = F- l S~ l F + A T F- 1 I{INI)- L IFA 



(5) 



In the above equations A T is the beam transpose and 
F^ 1 is the inverse Fourier transform. The first term in 
the right-hand side of the above equation provides the so- 
lution for the Wiener-filtered map, while the second and 
third terms of Equation Q provide random fluctuations 
with the required variance. The vectors £i and £2 are of 
length n p with elements drawn from a standard normal 
distribution. 

The signal covariance matrix S is diagonal in the uv- 
plane for isotropic signals, so Se,e' = where 
£ = 2ttu, with u being the radial distance in the wu-plane. 
Here and throughout we assume the flat-sky approxima- 
tion that makes this identity possible. By construction, 
Gibbs sampling explores the exact posterior and there- 
fore treats the c ouplings introduced by partial sky cov- 
erage optimally (Wandelt et al. 2004 1. The algorithm 
"knows" about the couplings since they are contained in 
the data model (Equation IT]) which underlies the anal- 
ysis. However, to a very good a pproximation the noise 
covariance matrix N is diagonal (W hite et al.|1999 ), and 
thus we will assume this for simplicity. We assign to 
the matrix N entries equal to Nij = crfSij, where cr, is 
the noise variance for the ith pixel in the uw-plane. The 
construction I(INI)^ 1 provides a pseudo-inverse of N, 
so that any locations in the uw-plane with no antenna 
coverage do not yield infinities when taking the inverse. 

We solve numerically the above matrix- vector eq uation 
using a pre conditioned conjugate-gradient scheme (Press 
et al.|[l986 ). The preconditioner approximates the diag- 
onal components of M and is 



P- 1 = F^IilNiy^JF (F^A 2 



(6) 



Given the latest signal sample, s 4 , we generate a new 
angular power spectrum sample from Equation ([3]) by 
computing the variance irf in annuli of constant i on the 
Fourier-transformed signal. We then use this variance to 
draw from the probability density P(Cg\s l \ which fol- 
lows an inverse Gamma distribution, by creating a vector 
pe of length ri£ (assuming a Jeffreys' ignorance prior) and 
unit Gaussian random elements. Here, rig is the number 
of pixels in the bin £. The next power spectrum sample 
is then simply 



\Pi\ 



2 ■ 



(7) 



In the above, we assume 1(1 + l)Cg to be constant 
across the width of each annulus in the uu-plane. The 
width of each annulus can be set as desired. For the ex- 
ample that we show in this paper we chose the width to 
be 8ir/L, where L is the longest baseline of the interfer- 
ometer. This is four times the uv-space resolution. This 
choice limits correlations between angular power spec- 
trum bins which develop as a consequence of partial sky 
coverage. All ^-bins have uniform width except for the 
first, which we restrict to cover only the central zone 
where we enforce Cq = 0, since our analysis cannot con- 
strain the DC mode. We wish to capture as much power 
spectrum information as possible, so we correspondingly 
widen the width of the second bin to close this gap. 

To determine convergence so that our iterative sam- 
ples from the conditional distributions (Equation [3]) are 
indeed samples from the joint distribution, we employ 
multiple chains with different random number seeds. Our 
convergence criterion is the Gelman-Rubin (G-R) statis- 
tic, which compares the variance among chains to the 
variance within each chain. The G-R statistic asymp- 
totes to unity, so convergence is said to be achieved when 
this statistic is less than a given tolerance for each l- 
bin ( jGelman fc Rubin|[l992l ) . 



3. SIMULATIONS 

We begin with a realization of the CMB sky on a square 
patch of side 20 deg at 30 GHz based on a CAMB- 



produced angular power spectrum (Lewis et al. 20001 



with cosmological parameters consistent with W MAP 



seven -year results (Larson et al. 2011 
2TTiT) ) of n M = 27, ii A = 73, Ji 
Ho = 70km s" 1 



Komatsu et al 
0.045, and 



where A is the Fourier transform of the beam pattern. 



10 = fUKin s - Mpc~ . We discretize this signal map 
and corresponding ww-plane with 256 pixels per side, 
which gives a spatial resolution of 4.7 arcmin and a uv- 
plane resolution of 2.86 A. While this size of patch vi- 
olates the strict flat-sky approximation, it allows us to 
explore the validity of our technique at higher resolu- 
tions and probe the range of scale s accessible to reali stic 
interferometers, such as the CBI ( |Mason e"t~a l. 2003). 

We model the primary beam pattern A as a Gaussian 
with peak value of unity and standard deviation 1.5 deg. 
With these parameters the beam decreases to a value of 
10~ 3 halfway to the edge of the box. This allows us to 
include all Fourier modes up to the Nyquist frequency 
in our analysis and ensures that the periodic boundary 
conditions inherent in the Fourier transform do not cause 
unwanted edge-effects. 

We assemble the interferometer array in a simple way 
by randomly placing 20 antennas and selecting all base- 
line pairs within the uu-plane. We then allow the assem- 
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Fig. 1. — Coverage pattern (black) in the uu-plane. This pattern 
results from selecting all baseline pairs from 20 randomly-placed 
antennas pointed at a celestial pole. The pattern is then rotated 
uniformly on the same patch of sky for 12 hr. 
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Fig. 2. — Antenna coverage for each £-b'm. Shown is the percent- 
age of cells within the annulus defined by the radius I intersected 
by any antenna during the 12 hr observation period. 



bly to rotate uniformly for 12 hr while observing the same 
sky patch at a celestial pole. We construct the interfer- 
ometer pattern / by placing a value of one wherever a 
baseline length intersects a pixel during its rotation and 
zeros elsewhere. We show the resulting itv-plane cover- 
age in Figure[l] This configuration covers roughly 70% of 
the ify-plane, although the coverage varies significantly 
for each £-bin, as Figure [2] shows. Some bins, especially 
at very low and very high^ have zero coverage due to the 
lack of baselines at that distance. However, even if these 
bins had adequate coverage, we expect statistics here to 
be relatively poor due to the reduced number of modes 
in these regions. Most bins have at least 60% coverage 
and several bins have complete coverage. 



We determine the noise per pixel by summing the in- 
tegration time spent in that pixel by all baselines. We 
do not adopt a noise model for a particular instrument; 
rather, we set the noise variance to be of cx l/t bs,i- We 
then set an overall signal-to-noise ratio of 10 by multi- 
plying all noise variances by a constant value to main- 
tain \IFA s\/\In\ = 10. This provides a scaling of the 
noise that would normally be caused by instrument ef- 
fects such as the surface area of the detectors and the 
system temperature in a realistic observation. We use 
a Gaussian realization of this noise to create the data 
in Equation 0. We show the process going from input 
signal to data d in Figure [3| 

3.1. Computational Considerations and Correlation 
between Samples 

The computational complexity of the Gibbs sampling 
algorithm as applied to single-dish experi ments has been 
discu ssed in detail in previous works (e.g., Wandelt et al.| 



2004 1 where it was shown that the scaling is dominated 
by spherical harmonic transforms, which scale as 0(£ 3 ). 
Since we are dealing with small patches on the sky, 
assume that all horns have identical beams, and dis- 
cretize the uu-plane, the analysis time is dominated by 
two-dimensional fast Fourier transforms, which scale as 
0{e \og£). 

The scaling estimate concerns the time to compute in- 
dividual samples. The total time required to perform 
a full analysis also depends on the number of samples 
required for a satisfactory exploration of the Bayesian 
posterior density. For single-dish experiments the cor- 
relation length for individual Cg bins can become large 
for high I where many modes with individually low 
signal-to-noise combine together. This can lead to very 
long convergence times and limited scalability, which re- 
quires special tricks for the analysis of low si gnal-to-noise 
regimes as described in Jewell et al. (2009). As we will 



see, however, compact antenna arrays result in an ap- 
proximately uniform coverage of the uv -plane. The cor- 
relation length is a function of coverage and therefore we 
obtain approximately uniform signal-to-noise as a func- 
tion of I which means the correlation function will be a 
weak function of I. 

We ran four independent Gibbs sampler chains for 2000 
iterations each. We discarded the first 1000 iterations 
for a "burn-in" phase. After this number of iterations, 
the G-R statistic reached less than 1.1 for all bins. This 
analysis took roughly 75 hr on 4 cores of an Intel dual six- 
core X5650 Westmere 2.66GHz machine and consumed 
only 10 MB of memory. We show in Figure [4] the number 
of steps required to satisfy our G-R convergence criterion 
versus the sky coverage percentage for each l-\ym. Shown 
are the number of steps after the burn-in phase. 

We see little correlation between uw-plane coverage 
and time to convergence. As we will see in the power 
spectrum results below, reduced coverage does lead to a 
variable correlation length (larger for small power and 
shorter for large power). This means that some low- 
coverage bins dominate our convergence rate and hence 
overall performance. This is a consequence of the ratio 
of the siz e of the Gibbs mov e to the full width of the 
posterior (Er iksen et al.|[2004 ). 

Large correlation lengths, which lead to longer conver- 
gence times, do not have a large impact on the feasibility 
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Fig. 3. — Observation-making process. Shown are (a) the input sky signal s, (b) the application of a primary beam, which we model as 
a Gaussian with standard deviation 1.5 deg, (c) Fourier transformation onto the tre-plane, (d) application of 20 randomly placed antennas 
rotated uniformly for 12 hr, and (e) the addition of the noise. Note that all images in the uv-plane are shown as magnitudes. 

550 



of the example analysis we present here; however, they 
would be more important in the limit where detector 
noise is large compared to the observed signal (e.g., if the 
goal was to place upper limits on an as-yet unobserved 
signal, as is currently the case with inflationary B-mode 
missions). This issue will therefore have to be revisited 
for polarization data. In any case, if long correlation 
lengths in the Markov chain were to cause poor perfor- 
mance, we c ould follow the general prescription of Jewell| 
et al. ( 2009[ ) by incorporating a step with a Metropolis- 
Hastings sampler and deterministic rescaling of the sky 
signal. 



4. RESULTS 



4.1. 



Power Spectrum 

In Figure [5] we show the mean posterior angular power 
spectrum of the of the four independent chains after 
reaching convergence. We also show the uncertainty as- 
sociated with each £-bin and the corresponding power of 
our input signal realization. The size of the uncertainties 
in each bin are consistent with varying coverage in the 
uu-plane; i.e., those bins with little to no coverage have 
correspondingly large error bars, while those bins with 
complete coverage have the tightest constraints. All of 
our estimates fall within 2a of the expected value, and 
most within la, as expected with this number of bins. 

Figure [6] shows individual marginalized probability 
densities for a selection of £-bin. We immediately note 
the shape of each probability distribution matches qual- 
itatively that of an inverse gamma distribution, as ex- 
pected. We see the damaging effects of the lack of uv- 
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Fig. 4. — Number of steps after burn-in required to satisfy the 
G-R convergence criterion versus the sky coverage percentage for 
each £-bin. 

plane coverage especially in the case of I ~ 300. Here, the 
lack of any input signal leads to a uniformly decreasing 
posterior probability density function (pdf) with a very 
long power-law tail to large power. This can be summa- 
rized as an upper bound on the power in that bin. In 
other bins, slightly reduced coverage combined with the 
effects of noise yields wide and noisy distributions. 

Beyond the marginalized posteriors for each angular 
power spectrum bin amplitude, the posterior samples 
also contain higher-order information. Figure [7] shows 
the two-point correlations between all pairs of angular 
power spectrum bins. For this plot, we suppress the di- 
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Fig. 5. — Mean posterior angular power spectrum for each l- 
bin (black) with la (dark gray) and 2cr (light gray) uncertainties. 
The binned input CMB angular power spectrum is shown in blue 
and the angular power spectrum of our specific signal realization 
is shown in pink. 

agonal components so that we can more easily examine 
the cross-bin correlations. While the correlation matrix 
is somewhat noisy, we immediately see the effects of the 
reduced sky coverage due to a finite beam width as cor- 
relations between adjacent modes. This beam leads to 
reduced Fourier space resolution. The correlation data 
are informative about regions of the signal data d which 
are larger than a single angular power spectrum bin. 
This slight oversampling of the power leads to a ten- 
dency of bins to be anti-correlated with each other. Anti- 
correlation occurs because data constrain power in re- 
gions of the signal and if a particular bin scatters high, 
the inference in adjacent bins reduces in power to r emain 
consistent with the data ( Eisner fc WandeTt||2012 ). 



4.2. Signal Reconstruction 

Figure [8] shows several sample posterior maps (i.e., the 
solution s l+1 to Equation^]). We show the signal samples 
at steps and 1000 after the "burn-in" phase. The Gibbs 
sampler fills in the map outside of the primary beam with 
fluctuations consistent with the actual data. While maps 
made from interferometric data are often accompanied by 
an "effective beam", each signal sample presented here 
has no smoothing. The signal samples are a population 
of possible pure signal skies which are consistent with the 
data. 

We can also easily compute Wiener-filtered maps: set- 
ting the fluctuation vectors £i = £2 = in Equation Q 
and solving f or provides the definition of a Wiener- 
filtered map ( |Wandelt et al. [2004[ ). We show these maps 
for iterations and 1000 also in Figure[8l The Wiener fil- 
ter adaptively smooths the map depending on the data 
support. In this case, the filter eliminates the fluctu- 
ations outside of the primary beam, where there is no 
data, and successfully recovers the signal in the observed 
region. 

After sufficient iterations the artificially created fluctu- 
ations average out and we are left with a reliable recon- 
struction of the input signal within the area of the pri- 
mary beam, which we show in Figure [9] We accompany 
this final mean posterior signal map with the "dirty" 
map, which is F~ 1 (IAF s + In). Note that the dirty 



map contains artifacts such as grating rings and sidelobes 
which are absent from our posterior map. Also, the dirty 
map completely misses the strongest fluctuations. 

We show the difference between the final mean poste- 
rior signal map to the input map in Figure |10| We see 
that medium-scale fluctuations remain; however, the am- 
plitude of these fluctuations are below the typical scale of 
fluctuations in the input signal and are consistent with 
the uncertainties in the median-^ reconstructed power 
spectrum (Figure pi), which are caused by incomplete uv- 
plane coverage. Outside the beam our final posterior map 
is nearly zero (as desired), leading to a reconstruction of 
the input map in the residual. 

5. CONCLUSION 

We have successfully demonstrated the technique of 
Gibbs sampling as applied to interferometric observa- 
tions. Our approach accounts for realistic interferometer 
features, such as baseline-dependent noise, mode cou- 
pling due to a finite beam size, and incomplete cover- 
age of the w-plane. We have presented an example of 
CMB angular power spectrum estimation and signal re- 
construction of a moderately large (n p = 256 2 ) mock 
data set directly applicable to current and near-future 
missions. 

A complete interferometric analysis tool for a realistic 
future mission must include several features not studied 
here: 



2002 



multiple fre quencies, polarization ( Chiueh fc Ma| 
mosaicking ([Bunn fc White|2007[ ) , a true sph erical 
sky Bunn fc W hite 20051), and wide band widths ( |Sub-| 



rahmanyan 2004|). Building a complete pipeline is be 
yond the scope of the present study which focused on 
the ability of Gibbs sampling to explore the full shape 
of the joint, multivariate, non-Gaussian posterior pdf for 
the angular power spectrum and the signal map given 
the data. 

Within the Bayesian framework, strictly speaking, an- 
gular power spectrum estimation only makes sense if 
Gaussianity and statistical isotropy is assumed. An 
isotropic Gaussian process prior is a highly accurate 
model of the CMB anisotropics and therefore the detailed 
shape of the derived angular power spectrum posterior is 
of great physical interest for that application. 

While we have focused our example on CMB observa- 
tions, our basic technique can be generalized to a wide va- 
riety of interferometer observations, such as the planned 
stud y of the 21 cm signal fr om the epoch of reioniza- 
tion (|Morales fc Hewitt 2004), which ca n provide useful 



const raints on cosmologlcal parameters (McQuinn et al. 
2006 1. In applications to non-CMB data sets tne Gaus- 



sian process prior may not strictly hold, but can be mo- 
tivated from a maximum entropy argument: given only a 
model of the mean and covariance (defined by the angu- 
lar power spectrum) of a data set, the least informative 
completion to a full probabilistic model yields a Gaussian 
process prior. In that case the method still maps out the 
angular power spectrum likelihood taking into account 
all modeled signals and imperfections in the data at the 
two-point correlation level. 

In the derivation of the Gibbs sampling method using 
the conditional densities of the posterior distribution, the 
signal angular power spectrum samples are paired with 
optimal (minimum-variance) signal reconstructions as- 
suming that same power spectrum. Averaged along the 
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given redshift slice. To tackle such a difficult analy- 
sis, our Gibbs sampling approach must be extended to 
deal with the sig nificant galactic and extragalactic fore- 
grounds present (|Santos et al. |2005l |Furlanetto et~al 



2006 



Bernardi et al.||2009| |Liu fe Tegmark||2012[ ) and to 
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Fig. 7. — Power spectrum correlation matrix. Shown are the cor- 
relations between all pairs of angular power spectrum bins. Diago- 
nal elements are suppressed to enhance the visibility of off-diagonal 
correlations. This matrix highlights the mode coupling due to the 
beam as correlations between adjacent bins. We also see slight 
(anti-)corrclations of adjacent angular power spectrum bins due to 
discrete sampling of the angular power spectrum. 



chain the algorithm therefore simultaneously discovers 
the correlation structure in the data, reconstructs opti- 
mal sky maps, and explores the range of uncertainty left 
by having finite amounts of imperfect data. The mean 
posterior signal reconstruction can be viewed as a non- 
linear generalization of the least-squares optimal signal 
reconstruction provided by the Wiener filter without re- 
quiring an a priori choice of the signal covariance. 

In any case, the assumption of statistical isotropy in 
the covariance model still holds in a standard FRW 
cosmology even for applications to 21 cm data on any 



include in the solution the full three spatial dimensions of 
the data and frequency dependence of the angular power 
spectrum. 

By extending the Gibbs sampling framework to inter- 
ferometric observations, we have demonstrated the va- 
lidity of this method in this regime and its power in 
providing computationally efficient 0(n p log n p ) angular 
power spectrum analysis and signal reconstruction. Our 
method is able to fully explore the angular power spec- 
trum likelihood shape in an optimal fashion. However, 
even with this scaling extremely large data sets can still 
take weeks to analyze with a single machine. Fortunately 
there are many opportunities for parallelism. For exam- 
ple, independent chains can be run as separate execution 
threads in a straightforward manner. Also, we dev eloped 
our code w i th th e open -source PETSc library (Balay 



et~a^[l997l |20lol [2011 



sion 61 PPTW (F 



and the MPI-p arallelized ver 
meaning that 



rigo & Johnson 2005), 
maps can be divided up among multiple processors with 
communication handled by the internal libraries. This 
Bayesian framework can therefore plausibly cope with 
the extreme data analysis requirements of future CMB 
or 21 cm missions. 
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